Identification of Differentially Expressed and Prognostic lncRNAs for the Construction of ceRNA Networks in Lung Adenocarcinoma

Background Long noncoding RNAs (lncRNAs) could function as competitive endogenous RNAs (ceRNAs) to competitively adsorb microRNAs (miRNAs), thereby regulating the expression of their target protein-coding mRNAs. In this study, we aim to identify more effective diagnostic and prognostic markers for lung adenocarcinoma (LUAD). Methods We obtained differentially expressed lncRNAs (DElncRNAs), miRNAs (DEmiRNAs), and mRNAs (DEmRNAs) for LUAD by using The Cancer Genomes Atlas (TCGA) portal. Weighted gene coexpression network analysis (WGCNA) was performed to unveil core gene modules associated with LUAD. The Cox proportional hazards model was performed to determine the prognostic significance of DElncRNAs. The diagnostic and prognostic significance of DElncRNAs was further verified based on the receiver operating characteristic curve (ROC). Cytoscape was used to construct the ceRNA networks comprising the lncRNAs-miRNAs-mRNAs axis based on the correlation obtained from the miRcode, miRDB, and TargetScan. Results Compared with normal lung tissues, 2355 DElncRNAs, 820 DEmiRNAs, and 17289 DEmRNAs were identified in LUAD tissues. We generated 8 WGCNA core modules in the lncRNAs coexpression network, 5 modules in the miRNAs, and 12 modules in the mRNAs coexpression network, respectively. One lncRNA module (blue) consisting of 441 lncRNAs, two miRNA modules (blue and turquoise) containing 563 miRNAs, and one mRNA module (turquoise), which consisted of 15162 mRNAs, were mostly significantly related to LUAD status. Furthermore, 67 DEmRNAs were found to be tumor-associated as well as the target genes of the DElncRNAs-DEmiRNAs axis. Survival analyses showed that 6 lncRNAs (LINC01447, WWC2-AS2, OGFRP1, LINC00942, LINC01168, and AC005863.1) were significantly correlated with the prognosis of LUAD patients. Ultimately, the potential ceRNA networks including 6 DElncRNAs, 4 DEmiRNAs, and 22 DEmRNAs were constructed. Conclusion Our study indicated that 6 DElncRNAs had the possibilities as diagnostic and prognostic biomarkers for LUAD. The lncRNA-mediated ceRNA networks might provide novel insights into the molecular mechanisms of LUAD progression.


Introduction
Lung cancer is the leading cause of cancer-related death worldwide, of which lung adenocarcinoma (LUAD) is the dominant histological subtype, accounting for 40% of all cases [1,2]. Statistics show that a dismal 5-year survival rate is less than 20% despite recent advances in therapies [3]. e major factors in unfavorable prognosis of LUAD are diagnosis at terminal cancer and the propensity for metastasis [4]. Hence, there is an urgent need to identify new biomarkers to predict diagnosis and prognosis at an early stage and explore novel therapeutic targets for LUAD [5].
High-throughput genome sequencing and microarrays have indicated that 75% of the human genomes are transcribed into noncoding RNAs with the exception of proteincoding genes [6,7]. Long noncoding RNAs (LncRNAs) are a class of RNA transcripts with a length of more than 200 nucleotides without protein-coding ability [8]. LncRNAs are broadly perceived for their functions in regulating biological processes through different mechanisms in various cancer types and have held substantial promise as novel biomarkers for cancer therapy [9][10][11]. MicroRNAs (miRNAs) have also been confirmed to play an important role in cancer progression over the past decades [12,13]. Intriguingly, increasing evidence supports that lncRNAs act as endogenous molecular sponges that recognize and competitively bind to miRNAs by sharing miRNA response elements (MREs), indirectly regulating target mRNAs at a post-transcriptional level [14,15]. Besides, the hypothesis that the complicated ceRNA networks participate in tumor development has been verified [16,17]. For instance, the lncRNA ITGB8-AS1-miR-33b-5p-ITGA3 axis was reported to promote invasion and migration in colorectal cancer [18]. LncRNA PVT1, as a ceRNA for miR-143, upregulated HK2 expression and promoted proliferation of gallbladder cancer cells [19].
Weighted gene coexpression network analysis (WGCNA) lied in the construction of scale-free gene coexpression networks to identify crucial modules of highly correlated genes that are associated with specific clinical features [20,21]. e advantage of WGCNA is that it can identify and cluster highly correlated genes into the same module. At present, WGCNA plays a significant role in multiple fields, such as cancer, nervous system, and genetic data analysis, which is extremely useful for identifying potential candidate biomarkers or novel treatment targets [22][23][24][25].
In the current study, we identified differently expressed lncRNAs (DElncRNAs), miRNAs (DEmiRNAs), and mRNAs (DEmRNAs) and obtained the key modules relevant to LUAD traits by using WGCNA. Six diagnostic and prognostic DElncRNAs and 6 lncRNAs-4 miRNAs-22 mRNAs ceRNA networks may provide a useful basis for formulating early diagnosis and individualized treatments in LUAD.

Research Process Design.
e bioinformatics scheme design of the study is shown in Figure 1.

Construction of the Weighted Gene Coexpression Network
and Identification of Module Eigengenes. We incorporated RPKM (Reads Per Kilobase per Million) files of lncRNAs, miRNAs, and mRNAs into WGCNA analysis and constructed gene coexpression networks using the WGCNA R package [26]. e process included the following key steps [20,21]: Firstly, the outliers were removed using the abline function for the clustered samples. Secondly, the established similarity matrix was converted into an adjacency matrix based on the β value. On this foundation, a topological overlap matrix (TOM) was constructed which was used to carry out the corresponding dissimilarity, and the hierarchical clustering tree of genes (dendrogram) was generated through hierarchical clustering to implement module detection. Finally, Module Members (MMs) and Gene Significance (GS) were counted and further investigated for module signature genes that were closely associated with cancer progression. e construction process among lncRNA, miRNA, and mRNA coexpression networks was similar with the exception of some parameters: in the selection of soft power values, β values of lncRNAs, miRNAs, and mRNAs were 4, 3, and 1, respectively. e height cutoff MEDiss res of lncRNA, miRNA, and mRNA settings of similar modules was 0.5, 0.8, and 0.4, respectively. In terms of recognizing dynamic modules, 3 kinds of RNAs had the same conditions (deepSplit � 2, minModuleSize � 30).

Survival Analysis.
In combination with clinical information of TCGA-LUAD samples, univariate and multivariate Cox regression analysis were performed using survival R package Coxph function to clarify the relationship between characteristic lncRNAs and overall survival (OS), and forest maps were drawn using forestplot R package for visualization. LncRNAs significantly associated with prognosis were involved in the construction of the ceRNA regulatory networks.
e area under the curve (AUC) for 1-year, 3-year, and 5-year OS was calculated by the 'timeROC' R package to assess the predictive accuracy of prognosis. In addition, diagnostic ROC curves were plotted with IBM SPSS Statistics 26 for the lncRNA signature. P < 0.05 was considered statistically significant.

Identification of DElncRNAs, DEmiRNAs, and DEmRNAs in LUAD.
TCGA-LUAD mRNA expression data, including 534 LUAD samples and 59 normal samples, were downloaded and matched with Genecode v38 for obtaining lncRNA expression data. e expression profiles of miRNAs in 521 tumor samples and 46 normal samples were explored.

Construction of Gene Coexpression Networks to Obtain
Hub Modules. WGCNA, a systematic biological approach, was conducted to certify clinical phenotype in relation to coexpressed genes in networks. Selection of soft threshold power was a critical step in constructing WGCNA. To determine the relative balance between scale independence and average connectivity, we analyzed network topologies with soft threshold power ranging from 1 to 20. When the power value (β) was confirmed to 4 (lncRNAs), 3 (miRNAs), and 1 (mRNAs), the corresponding fitting index reached 0.9, and the coexpression network satisfied the scale-free distribution (Supplementary Figures S1(a)-S1(c)). We generated 8, 5, and 12 key modules (noted by different colors) in lncRNA, miRNA, and mRNA coexpression networks through the dynamic tree cutting method (Figures 3(a)-3(f )). Each module was color coded, but the genes in the gray module did not belong to any other module. Notably, we also identified the relationship of each module with the LUAD phenotype. e results showed that there was a significant association between the blue module and tumor phenotype in the lncRNAs coexpression networks (weighted correlation of module features � 0.78) (Figure 3(b)). Meanwhile, the turquoise module was obviously correlated with tumor characteristics in the mRNAs coexpression networks (module trait weighted correlation � 0.71) (Figure 3(f )). For miRNA coexpression networks, both blue and turquoise modules were significantly correlated with the tumor phenotype (module trait weighted correlation � 0.59/0.56) (Figure 3(d)). e genes in the core module were extracted for further analysis (WGCNA-lncRNAs � 441, WGCNA-miRNAs � 563, and WGCNA-mRNAs � 15162) (Figures 4(a), 4(c), 4(d), and 4(f )).
Next, 10 intersectional miRNAs were predicted by Tar-getScan and miRDB online target gene prediction tools for their target genes (Figure 4(e)). No targeted genes were predicted for miR-142-3p at the TargetScan website, and results of the remaining 9 miRNAs showed that 3074 miRNAs-mRNAs pairs included 2742 target genes; 6121 miRNAs-mRNAs pairs were retrieved on the miRDB website, containing 2742 target genes. ere were 1388 mRNAs that were duplicated in both sites (Figure 4(h)). Finally, 67 target mRNAs were selected from DEmRNAs,  Table S2). We performed reverse inferences based on 67 target genes and received 38 pairs of miRNAs-mRNAs (including 6 miRNAs and 38 mRNAs). e interaction effect between 6 miRNAs and 99 lncRNAs was also concluded at length.

Construction of lncRNAs-miRNAs-mRNAs Networks for LUAD.
When miRNA binds to MRE on lncRNAs, mRNA expression is not inhibited; hence, miRNAs are mostly negatively correlated with lncRNA and mRNA expression (Supplementary Figure S2) [27]. erefore, we screened for negatively associated genes, which included 59 lncRNAs, 4 miRNAs, and 22 mRNAs. Clinical data were downloaded from TCGA-LUAD, of which 512 samples had complete clinical information. Clinicopathological features of pT stage, pN stage, pM stage, and pTNM stage were incorporated into analysis, and the Coxph function in the survival R package was used to perform univariate and multivariate Cox regression analysis (Supplementary Tables S4 and S5.). As a consequence, 6 lncRNAs were identified as crucial prognostic factors (WWC2-AS2, OGFRP1, LINC00942, LINC01168, and AC005863.1 were risk factors, and only LINC01447 belonged to protective factor) (Figures 5(a) and 5(b)) (Supplementary Table S6). Cox regression analysis was performed to obtain risk scores of each sample which were used for ROC analysis of the prognosis classification utilizing the 'timeROC' R package. As shown in Supplementary Figure S3 Figure S3(b)). Eventually, we constructed ceRNA networks for 6 lncRNAs, 4 miRNAs, and 22 mRNAs, that were visualized using Cytoscape v3.7.2 software and an alluvial plot (Figures 6(a) and 6(b)).

Discussion
Due to the unfavorable prognosis and high mortality rate of LUAD, it is necessary to improve the strategy of diagnosis and treatment. e lncRNA-mediated ceRNA hypothesis proposed that lncRNA functions as a ceRNA to regulate the gene expression by influencing miRNA activity. A previous study suggested that lncRNA-KRTAP5-AS1 and lncRNA-TUBB2A could serve as ceRNA to reinforce proliferation, invasion, and EMT function of Claudin-4 [28]. HOXD-AS1 was bound to miR-130a-3p in a competitive manner, which activated the expression of EZH2 and MMP2 and facilitated liver cancer metastasis [29]. Previous studies suggested that lncRNA as ceRNA played an important biological function in LUAD, but the tumor-specific ceRNA networks launched by lncRNAs remained largely unknown [30,31]. Different from lncRNA-regulated ceRNA networks in LUAD established by Wu et al., six distinct lncRNAs were exhibited in our ceRNA networks. e reason may be that different bioinformatics tools and concerns were applied (i.e., we used the WGCNA analysis and conducted Cox regression analysis to identify cancer-related prognostic lncRNAs).
In the present study, we identified 6 differentially expressed and prognostic lncRNAs. Among them, LINC01447, LINC01168, and AC005863. 1   reported to date. It is interesting to explore the biofunctional role in the development and progression of LUAD for these three lncRNAs. e other three were lncRNA OGFRP1, LINC00942, and WWC2-AS2, which were both reported in the field of malignant tumors [32][33][34][35][36]. OGFRP1 promoted tumor progression by increasing the activity of the AKT/ mTOR pathway or directly interacting with miR-4640-5p [32,33]. Recent studies have shown that LINC00942 potentiated breast cancer cell proliferation and progression by affecting METTL14-mediated m6A methylation [34]. WWC2-AS2 and LINC00942 were involved in the construction of a prognostic lncRNA signature in cervical cancer and lung adenocarcinoma [35,36]. In the current study, high expression of OGFRP1, LINC00942, and WWC2-AS2 was associated with poor prognosis of LUAD patients, which was in line with the abovereported results. Nevertheless, all these lncRNAs associated with molecular events still need further experimental validation in LUAD.
Four predicted miRNAs in ceRNA networks stand out in our study.
In the established ceRNA networks, the 22 DEmRNAs attracted the researchers' attention, and they found that they were effective regulators during cancer progression [43][44][45][46][47][48][49]. EPHB2 has been associated with cancer stemness and acquired sorafenib resistance via the β-catenin/TCF1 axis [43]. CXCL5 as a tumor angiogenic factor promoted the expression of FOXD1 by activating the AKT/NF-κB pathway in colorectal cancer [44]. High expression of CDCA7 promoted tumorigenesis and predicted poorer prognosis in patients with TNBC and ESCC [45,46]. Downregulation of TNNC1 (Troponin C1) expression accelerated tumor formation and increased mortality in LUAD patients [47]. Glioma cells with low SYT14 (Synaptotagmin 14) expression were observed to suppress the proliferation capacity [48]. Upregulation of SPOCK2 negatively regulated MMP2 gene expression, which in turn inhibited the invasion and metastasis of prostate cancer cells [49]. ese studies indicated these potent cancer regulators involved in the present ceRNA networks.

Conclusions
We used bioinformatics methods to construct the LUADspecific lncRNA-mediated ceRNA regulatory networks. We also identified 6 DElncRNAs as prognostic biomarkers which might play critical roles in tumorigenesis and development of lung cancer. Further experimental verification   Journal of Oncology is needed to elucidate the underlying regulatory mechanism in the future.

Data Availability
All data generated or analyzed during this study are included in this published article and its supplementary information files.

Conflicts of Interest
e authors declare no conflicts of interest.

Authors' Contributions
Yimeng Cui, Yaowen Cui, and Ruixue Gu contributed equally to this work.

Supplementary Materials
Supplementary Figure S1. Analysis of network topology for soft thresholding powers (weighting coefficient, β) in lncRNAs, miRNAs, and mRNAs. e x-axis represented different soft thresholding powers. Upper: assessment for R 2 of log(k) and log(p(k)) correlation coefficients corresponding to different β values in the network. e red line indicated a scale-free topology fitting index R 2 of 0.9. Lower: analysis of the mean connectivity for various β values. Supplementary Figure S2